Conjugate Gradient (CG)

The Conjugate Gradient (CG) method is an iterative algorithm for solving systems of linear equations. In this notebook it is used to demonstrate de-blurring of an image suffering from optical defocus.



last update: 02/05/2022

Author: Christopher Hahne, PhD

Data preparation

At the beginning we load a picture of the Swiss Alps with crisp high-frequency features. To mimic optical defocus, we convolve the ground-truth image with a Point-Spread Function (PSF), which represents the impulse response of an objective lens and is typically used to model the interference of propagating light waves.
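The blurring step can be sketched as follows. This is a minimal stand-in, assuming a Gaussian-shaped PSF and a random array in place of the alpine photograph; the original notebook may use a different kernel shape and image loader:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((64, 64))          # stand-in for the alpine photograph

# Gaussian PSF as a simple defocus model (assumption: the notebook may
# use a different kernel, e.g. a disc-shaped defocus PSF)
size, sigma = 9, 2.0
ax = np.arange(size) - size // 2
psf = np.exp(-(ax[None, :]**2 + ax[:, None]**2) / (2 * sigma**2))
psf /= psf.sum()                    # normalize so brightness is preserved

# circular convolution via the FFT: blurred = PSF * image
psf_pad = np.zeros_like(img)
psf_pad[:size, :size] = psf
psf_pad = np.roll(psf_pad, -(size // 2), axis=(0, 1))  # center the kernel
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(psf_pad)))
```

Since the kernel sums to one, the blur redistributes intensity without changing the overall brightness of the image.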

Optimization

Our goal in this notebook is to counteract the lens blur and retrieve image information by means of the Conjugate Gradient (CG) algorithm. The objective function in CG writes

$$ \text{arg min}_{\mathbf{x}} \, \frac{1}{2}\mathbf{x}^{\intercal}\mathbf{A}\mathbf{x}-\mathbf{b}^{\intercal}\mathbf{x}+\mathbf{c} $$

where $\mathbf{A}\in\mathbb{R}^{n\times n}$ is symmetric, positive-definite, $\mathbf{b}\in\mathbb{R}^n$ is the observation and $\mathbf{x}\in\mathbb{R}^n$ is the solution. The derivation can be found at the end of this notebook.
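A convolution operator $\mathbf{H}$ is generally neither symmetric nor positive-definite, so it cannot play the role of $\mathbf{A}$ directly. One common way to obtain a system of the required form, sketched here as an assumption since the notebook's exact formulation is not shown, is to solve the normal equations $\mathbf{H}^{\intercal}\mathbf{H}\mathbf{x} = \mathbf{H}^{\intercal}\mathbf{y}$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16
H = rng.random((n, n)) * 0.1 + np.eye(n)  # toy blur operator (hypothetical)
y = rng.random(n)                          # blurred observation

# Normal equations: H^T H is symmetric and positive-definite here,
# so CG applies with A = H^T H and b = H^T y.
A = H.T @ H
b = H.T @ y
```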

Conjugate Gradient Algorithm

For the vanilla version of the CG algorithm, the variable assignments at each update step $k$ are shown hereafter. The gain $\alpha_k$ is a scalar given by

$$ \alpha_k = \frac{\mathbf{r}_k^{\intercal} \mathbf{r}_k}{\mathbf{d}_k^{\intercal} \mathbf{A}\mathbf{d}_k} $$

and used in the solution update $\mathbf{x}_{k+1}$, which writes

$$ \mathbf{x}_{k+1} = \mathbf{x}_k + \alpha_k \mathbf{d}_k $$

where $\mathbf{d}_k$ is the direction vector. The residuals $\mathbf{r}_{k}$ are computed via

$$ \mathbf{r}_{k+1} = \begin{cases} \mathbf{r}_k-\alpha_k\mathbf{A}\mathbf{d}_k & \quad \text{if } \mod(k, 10) \neq 0\\ \mathbf{b}-\mathbf{A}\mathbf{x}_{k+1} & \quad \text{otherwise} \end{cases} $$

which recomputes the true residual every tenth step, based on the condition on $k$, to mitigate the accumulation of round-off errors in the recursive update. The CG algorithm comes with another gain, denoted by $\beta_{k+1}$,

$$ \beta_{k+1} = \frac{\mathbf{r}_{k+1}^{\intercal} \mathbf{r}_{k+1}}{\mathbf{r}_{k}^{\intercal} \mathbf{r}_{k}} $$

which helps obtain the next direction vector $\mathbf{d}_{k+1}$ by

$$ \mathbf{d}_{k+1} = \mathbf{r}_{k+1} + \beta_{k+1} \mathbf{d}_k $$

using the residuals. An implementation of this procedure is provided below.
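The update rules above can be sketched in Python as follows. This is a minimal dense-matrix sketch; the notebook presumably applies the same logic to the vectorized image with a convolution operator:

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, max_iter=500, tol=1e-10):
    """Vanilla CG with periodic true-residual recomputation."""
    x = np.zeros_like(b, dtype=float) if x0 is None else x0.astype(float)
    r = b - A @ x                  # initial residuals r_0
    d = r.copy()                   # initial direction d_0 = r_0
    rs_old = r @ r
    for k in range(max_iter):
        Ad = A @ d
        alpha = rs_old / (d @ Ad)  # gain alpha_k
        x = x + alpha * d          # solution update x_{k+1}
        if k % 10 != 0:
            r = r - alpha * Ad     # cheap recursive residual update
        else:
            r = b - A @ x          # true residual to curb round-off drift
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        beta = rs_new / rs_old     # gain beta_{k+1}
        d = r + beta * d           # next direction d_{k+1}
        rs_old = rs_new
    return x
```

The restart condition mirrors the piecewise residual definition above; in practice the recomputation period is a tuning choice rather than a fixed constant.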

Results

Animation

Derivation

$$ f(\mathbf{x}) = \frac{1}{2}\mathbf{x}^{\intercal}\mathbf{A}\mathbf{x}-\mathbf{b}^{\intercal}\mathbf{x}+\mathbf{c} $$

Differentiating with respect to $\mathbf{x}$ gives

$$ \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = \frac{1}{2}\left(\mathbf{A}^{\intercal}+\mathbf{A}\right)\mathbf{x}-\mathbf{b} $$

which is compressed to

$$ \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = \mathbf{A}\mathbf{x}-\mathbf{b} $$

as $\mathbf{A}$ is symmetric such that $\mathbf{A}^{\intercal}+\mathbf{A} = 2\mathbf{A}$. If we set the gradient to zero, we obtain

$$ \mathbf{b} = \mathbf{A}\mathbf{x} \quad\quad \text{if} \quad \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}=\mathbf{0} $$

such that we write the initial residuals $\mathbf{r}_0$ as

$$ \mathbf{r}_0 = \mathbf{b} - \mathbf{A}\mathbf{x}_0 $$
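The closed-form gradient $\mathbf{A}\mathbf{x}-\mathbf{b}$ from this derivation can be sanity-checked numerically against central finite differences of $f$. The matrix below is random and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
M = rng.random((n, n))
A = M.T @ M + np.eye(n)     # symmetric positive-definite by construction
b = rng.random(n)
c = 0.0                     # constant offset drops out of the gradient

f = lambda x: 0.5 * x @ A @ x - b @ x + c

x = rng.random(n)
grad = A @ x - b            # closed-form gradient from the derivation

# independent check via central finite differences
eps = 1e-6
num_grad = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                     for e in np.eye(n)])
```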